Tetris: Experiments with the LP Approach to Approximate DP

Authors

  • Vivek F. Farias
  • Benjamin Van Roy
Abstract

We study the linear programming (LP) approach to approximate dynamic programming (DP) through experiments with the game of Tetris. Our empirical results suggest that the performance of policies generated by the approach is highly sensitive to how the problem is formulated and the discount factor. Furthermore, we find that, using a state-sampling scheme of the kind proposed in [7], the simulation time required to generate an adequate number of constraints far exceeds the time taken to solve the resulting LP. As an extension to the standard approximate LP approach, we examine a bootstrapped version wherein a sequence of LPs is solved, with the policy generated by each solution being used to sample constraints for the next LP. Our empirical results demonstrate that this bootstrapped approach can amplify performance.
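For orientation, the sampled-constraint LP and the bootstrapping loop described in the abstract can be sketched as follows. The notation is assumed for illustration and is not taken from this page: $\Phi$ is a matrix of basis functions, $r$ the weight vector, $c$ a vector of state-relevance weights, $g_a(x)$ a one-step cost, $P_a(x,y)$ the transition probabilities, $\alpha \in (0,1)$ the discount factor, and $\mathcal{X}$ a sampled set of state-action pairs. The cost-minimization convention is also an assumption; a reward-maximizing formulation would flip the objective and the inequality.

```latex
% Reduced approximate LP over a sampled constraint set X (assumed notation).
\begin{align*}
\max_{r} \quad & c^\top \Phi r \\
\text{s.t.} \quad & g_a(x) + \alpha \sum_{y} P_a(x,y)\,(\Phi r)(y) \;\ge\; (\Phi r)(x)
  \qquad \forall\, (x,a) \in \mathcal{X}
\end{align*}

% Bootstrapping, as a sketch: the greedy policy induced by the k-th solution
% is simulated to sample the constraint set for the (k+1)-th LP.
\begin{align*}
\mu_k(x) &\in \arg\min_{a} \Big[ g_a(x) + \alpha \sum_{y} P_a(x,y)\,(\Phi r_k)(y) \Big], \\
\mathcal{X}_{k+1} &= \text{state-action pairs sampled by simulating } \mu_k, \\
r_{k+1} &= \text{a solution of the reduced LP with } \mathcal{X} = \mathcal{X}_{k+1}.
\end{align*}
```

In this form the number of variables equals the number of basis functions, while the number of constraints is set by how many state-action pairs are sampled, which fits the abstract's remark that simulation, not the LP solve, dominates the running time.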

Similar resources

The Smoothed Approximate Linear Program

We present a novel linear program for the approximation of the dynamic programming cost-to-go function in high-dimensional stochastic control problems. LP approaches to approximate DP have typically relied on a natural ‘projection’ of a well-studied linear program for exact dynamic programming. Such programs restrict attention to approximations that are lower bounds to the optimal cost-to-go fun...

Approximate Dynamic Programming via a Smoothed Linear Program

We present a novel linear program for the approximation of the dynamic programming cost-to-go function in high-dimensional stochastic control problems. LP approaches to approximate DP have typically relied on a natural ‘projection’ of a well-studied linear program for exact dynamic programming. Such programs restrict attention to approximations that are lower bounds to the optimal cost-to-go fun...

A Smoothed Approximate Linear Program

We present a novel linear program for the approximation of the dynamic programming cost-to-go function in high-dimensional stochastic control problems. LP approaches to approximate DP naturally restrict attention to approximations that are lower bounds to the optimal cost-to-go function. Our program – the ‘smoothed approximate linear program’ – relaxes this restriction in an appropriate fashion...

Approximate modified policy iteration and its application to the game of Tetris

Modified policy iteration (MPI) is a dynamic programming (DP) algorithm that contains the two celebrated policy and value iteration methods. Despite its generality, MPI has not been thoroughly studied, especially its approximation form which is used when the state and/or action spaces are large or infinite. In this paper, we propose three implementations of approximate MPI (AMPI) that are exten...

A learning algorithm based on $λ$-policy iteration and its application to the video game "tetris attack"

We present an application of the λ-policy iteration, an algorithm based on neuro-dynamic programming (described by Bertsekas and Tsitsiklis [BT96]) to the video game Tetris Attack in the form of an automated player. To this end, we first introduce the theoretical foundations underlying the method and model the game as a dynamic programming problem. Afterwards, we perform multiple experiments us...

Publication date: 2003